NetNews Offline 2

Internet Message Format | 1996-08-05 | 11.0 KB

Path: news.NetVision.net.il!news
From: Jack <avilev@netvision.net.il>
Newsgroups: comp.sys.amiga.programmer
Subject: Re: 680X0 -> PPC translator?
Date: Wed, 03 Apr 1996 07:23:56 -0700
Organization: NetVision LTD.
Message-ID: <3162980C.2003@netvision.net.il>
References: <31499F8E.26A9@netvision.net.il> <volker.0fw1@vb.franken.de> <315800D7.1854@sapiens.com> <volker.0g32@vb.franken.de> <315C198B.49C2@netvision.net.il> <volker.0g5w@vb.franken.de>
NNTP-Posting-Host: ts006p2.pop4a.netvision.net.il
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 2.0b6a (Win16; I)

Volker Barthelmann wrote:
>
> Jack (avilev@netvision.net.il) wrote:
> : Hi Volker, well i'll try to keep the lines shorter this time
> : so you'll be able to read the text more conveniently and who
>
> Thanks!
>
> : know. you might actually be convinced that 680x0 -> PPC is possible. ;-)
>
> I doubt that. :-)
>
> : > Perhaps You can perform this analysis. An algorithm can't.
> :
> : why the hell not, if i can understand assembly source code,
> : the machine can understand machine code right?!
>
> I don't know if You (or any human) can really 'understand' EVERY
> piece of machine code. Of course typical assembly source can be
> understood by humans, because it was written with that in mind.
>

you don't have to understand what a program does in order to convert
its code from one instruction set to another. all you look for are
specific things, and provided you know the op-codes of the 2 processors
there should be no problem whatsoever.

> : > Of course You can have a structure holding all Your variables. Now You pass
> : > the address of this structure to an external function that writes some
> : > values into it and You lost.
> :
> : i don't seem to care about that, get it through your head, i'm NOT
> : going to intervene with what the program does with any memory area,
> : just as long as this memory area is not later being used for code
>
> Unfortunately You have to know about all other memory areas as well,
> because otherwise You cannot determine what is code and what is data.

not necessarily. all static code is inside the program already; all
there is to do is follow the logic of the program, keeping track of
what memory areas are being used AND how. now don't be confused again:
you don't need actual run-time pointer values in order to know how a
memory area is being used, you use its 'symbol', i.e. the pointer
variable, to represent it. if you follow through the entire program,
never mind its logic, you can know which parts (within the exe) are
code or data: any part which is jmp'ed to is code, otherwise it's data.

>
> Think about: The program writes a value somewhere. Then copies it around,
> shifts it, moves it again etc. and then sometimes it reads it from where
> it is now, loads it in a0 and does a jmp (a0) or so.

oh god, how many times will i have to go through this. it doesn't
matter at all, because i don't care whether that area is jmp'ed to
directly or indirectly. if i find out that some memory is used as a
code area, i go back and trace where it was initially assigned and
then start following the changes made to it, making sure to change the
size of the area it points to so that the PPC code will fit nicely.
really, it's that simple, and if you don't believe me then i'll make
my point more concrete for ya: try looking at some assembly source
which does what you just said. YOU can know (if you follow the flow)
the exact area a certain pointer is pointing at, at a given point in
the program. you can also figure out how it's being used, and you can
also figure out what needs to be changed for the PPC translation to
work. if you can do it, bet an algorithm can too.

>
> : execution. then, and only then will i have to turn to all the locations
> : where that pointer could have been assigned and then change the size
> : argument which i already know to the appropriate sizes, once i figure
> : out that the area is code, i will perform a well defined series of actions
> : to resolve ALL dependencies related to that area, including size, write loops etc.
>
> Well, please define those 'well defined' series of actions.

as soon as i find out the pointer is a CODE pointer i do the following:

1) trace back in the program where that pointer was assigned.
2) decide whether the area it's pointing to is static (meaning within
   one of the program's segments) or a dynamic one.
3) if (2) == dynamic then calculate the 'source' code size and look it
   up within one of the arguments just before the call/jmp is made.
   the argument will be an immediate or stored in some variable; the
   point is it's INSIDE one of the program's segments and it's a REAL
   VALUE. change the size according to the translated code size.
   (i'm assuming the code has already been chewed up and spat out)
4) if (2) == static then increase the size of the hunk it's located in.
5) find the last write loop just before the call was made and change
   the counter's end value condition, changing the move instruction
   to move bytes.
6) go on happily to other parts of the program.

HUH, i wrote it, let's see your response to that Volker.
flame me good this time, ok? :-)

>
> : > There are several memory allocating functions or other functions You can't
> : > know anything about.
> :
> : then you take into account all of the various SYSTEM functions which allocate
> : memory (there aren't many of them) and proceed as normal.
>
> What do You do if a program does an OpenLibrary("some_custom.library",foo);
> and a jsr -some_strange_offset_I've_never_seen_before(a6)?
> This function could call AllocMem. You don't even know what parameters it
> takes etc.

AHHHHH, that is where you're wrong (again, teehee). here's that phrase
again, 'keep track'. i know, i know, this term is without a doubt
overused in my articles, but hey, this is WAR, any means can be used
to achieve the target, PPC dominance.

now, if you save your stack status before every call is made, you can
know which parameters on the stack belong to that function. for
example, if the stack contained A,B,C before, and now D,E,F are pushed
and then there's a call to some routine, then you know D,E,F are its
arguments, don't you. bear in mind that the external library is PPC
translated already; any code-modification tricks it does won't need
any changes.

of course, __regargs functions (in C) don't use the stack for all
parameters, i know, and some assembly programs like to pass arguments
in registers. well, in that case it's truly more difficult, but not
impossible. if the call is to some C RTL function, then no sweat: i
can go directly to the function, actually see how it uses the
registers in its code, discern the call prototype and make the
necessary adjustments. if it's a DOS library things might get a little
complicated, but that could be solved by some educated guess as to
what needs to be changed.

>
> : > You would have to adjust EVERYTHING that is in any way dependant on the code
> : > size! How are You going to do this?
> :
> : all you have to look for is where it was allocated and where it is copied,
> : assuming you're copying code that is. quite easily done if you follow the
> : change made to that pointer while 'running' through the program's logic.
>
> If You write a program that only reads a normal assembly source file that
> is known to copy some piece of code somewhere and Your program can change the
> source to copy one byte more, I'll believe You (and call You god, if You want).

well, if i have the time and energy i will, but 1st i have to write
the analyser program to do that. that might take some time.
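
the stack bookkeeping described above can be sketched roughly like this. it's only a toy in Python (the thread itself names no language): the trace format, the "snapshot" event and the name guess_call_arguments are all made up for illustration, and it only sees stack-passed arguments, so the register-passing case stays unsolved here.

```python
def guess_call_arguments(trace):
    """Guess stack-passed arguments for each call in a linear trace.

    `trace` is a hypothetical list of (op, operand) pairs, where op is
    "push", "pop", "snapshot" or "call". Everything pushed since the
    last saved stack depth is assumed to belong to the next call.
    """
    stack = []
    saved_depth = 0  # stack depth recorded at the last snapshot
    guesses = []
    for op, operand in trace:
        if op == "push":
            stack.append(operand)
        elif op == "pop":
            stack.pop()
            # A pop below the saved depth invalidates the old snapshot.
            saved_depth = min(saved_depth, len(stack))
        elif op == "snapshot":
            saved_depth = len(stack)
        elif op == "call":
            guesses.append((operand, stack[saved_depth:]))
            saved_depth = len(stack)  # start fresh for the next call
        else:
            raise ValueError(f"unknown trace op: {op!r}")
    return guesses


# The A,B,C / D,E,F example from the text.
trace = [
    ("push", "A"), ("push", "B"), ("push", "C"),
    ("snapshot", None),
    ("push", "D"), ("push", "E"), ("push", "F"),
    ("call", "some_routine"),
]
print(guess_call_arguments(trace))
```

a real analyser would have to derive the trace from the 680x0 code itself, along every path through the program, which is the hard part this sketch skips.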

> Please tell me more, I'm curious!

well, the project analysed the information flow in some organization
and tried to find the bottle-necks in that flow which ultimately
caused financial losses to that company. i won't go into many details
of how it's done, cuz you'd need a background in automated data
processing theory in order to comprehend the terms used and what i'm
talking about. suffice to say that by the end, the project involved an
ingenious, highly automated and controlled network of computers
running software which dealt specifically with the problems at hand,
which increased productivity in the short term and profits in the
longer term.

> Yes, You have. E.g. the allocator could assume that all requests are multiples
> of 1024 and rely on that. Now the original code may have been a multiple
> of 1024, but the PPC-code probably isn't. So when You pass the PPC-code-size
> to the allocator it will go nuts.

so what??? the area will later be used in some loop copying the code,
right?! it'll have to use an actual byte/word count to do that. you
look into the allocation 'prototype' and try to find that value. if
you can't find it, then you're probably right, it uses some multiple
of some size. in that case the minimum multiple it would have to be is
calculated by

    unit = actual_source_size/num_units_requested;

then you use that unit size in the calculation of the PPC code size
and you're done. of course some memory will be wasted, but who cares
about that.

>
> : granted, you're right about that not all solutions are clear cut but when it comes
> : to exact sciences such as computer-science then i would disagree, especially
> : since the problem we're dealing with isn't related to any human logic, that
> : field even in computer-science hasn't been fully explored and understood yet.
> : no, the only major problem is self-modification of the program otherwise you
> : translate the code 1:1.
>
> Still I claim You can't even reliably decide what is code and what is data.
> Even if self-modifying code is forbidden.

that's absurd. if a human can follow assembly source code and know
which parts are code and which ones are data, so can a fucken
algorithm.

> Assume a program that has some kind of keyfiles. It has n areas that could
> be code or data and Your algorithm has to decide that.
> Assume that the addresses of those areas are in an array adr[n].
> Now the program calls the system function Open("env:keyfile",MODE_OLDFILE),
> reads all longwords from the file, adds them up and calls adr[sum%n].
> To know which areas could be called You would have to know all valid
> keyfiles. Of course no algorithm knows them and therefore can't decide what
> is code. qed.

if that's your example of a real value, think AGAIN. this can easily
be solved by going through all the various hunks of the program trying
to find CODE sections. when you find a code instruction that 'makes
sense', mark the section's entry point. if somebody else calls it, you
know it's code for sure, and then you translate it. you go through all
such code sections and work out when and where they're being used,
making sure to translate only the code sections actually being used.

if a sequence of data words decodes as code, it might actually not be
code, so you have to defer the translation until you're sure about it.
chances are it's code: if more than 2-3 instructions exist in
sequence, that probably is code and can thus be translated. if it was
just random data, then nothing happened, we just randomised it again.

better think harder, now flame me.

Avi Lev.
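
the hunk scan described above could look roughly like the following toy sketch in Python. big assumptions: it pretends every instruction is exactly one 16-bit word and knows only three real 680x0 opcodes (NOP, RTS, MOVEQ #0,D0), while real 680x0 code is variable-length. classify_runs, call_targets and the verdict strings are invented for the example. as the post itself admits, a run of words that merely decodes can still be data, so "probably code" is a guess, not a proof.

```python
# Tiny subset of one-word 680x0 opcodes, enough for a toy scan.
ONE_WORD_OPCODES = {
    0x4E71: "nop",
    0x4E75: "rts",
    0x7000: "moveq #0,d0",
}


def classify_runs(words, call_targets, min_run=3):
    """Find runs of decodable words and give each a verdict.

    words        -- list of 16-bit values from one hunk
    call_targets -- set of word indexes known to be called from elsewhere
    min_run      -- run length from which a run counts as probable code

    Returns a list of (start_index, length, verdict) tuples.
    """
    runs = []
    i = 0
    while i < len(words):
        if words[i] not in ONE_WORD_OPCODES:
            i += 1
            continue
        start = i
        while i < len(words) and words[i] in ONE_WORD_OPCODES:
            i += 1
        length = i - start
        if start in call_targets:
            verdict = "code"           # somebody calls it, translate it
        elif length >= min_run:
            verdict = "probably code"  # plausible, but could be data
        else:
            verdict = "deferred"       # too short to decide yet
        runs.append((start, length, verdict))
    return runs


hunk = [0x1234, 0x7000, 0x4E71, 0x4E75, 0xFFFF, 0x4E71, 0xBEEF, 0x7000, 0x4E75]
for run in classify_runs(hunk, call_targets={7}):
    print(run)
```

note that the keyfile case from the quoted text is exactly where this gets stuck: an entry point that is only ever reached through adr[sum%n] never shows up in call_targets, so a short routine stays "deferred".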